Rates of Bootstrap Approximation for Eigenvalues in High-Dimensional PCA
Authors
Abstract
In the context of principal components analysis (PCA), the bootstrap is commonly applied to solve a variety of inference problems, such as constructing confidence intervals for the eigenvalues of the population covariance matrix $\Sigma$. However, when the data are high-dimensional, there are relatively few theoretical guarantees that quantify the performance of the bootstrap. Our aim in this paper is to analyze how well the bootstrap can approximate the joint distribution of the leading eigenvalues of the sample covariance matrix $\hat\Sigma$, and we establish non-asymptotic rates of approximation with respect to the multivariate Kolmogorov metric. Under certain assumptions, we show that the bootstrap can achieve the dimension-free rate ${\tt{r}}(\Sigma)/\sqrt n$ up to logarithmic factors, where ${\tt{r}}(\Sigma)$ is the effective rank of $\Sigma$ and $n$ is the sample size. From a methodological standpoint, our work also illustrates that applying a suitable transformation to the eigenvalues of $\hat\Sigma$ before bootstrapping is an important consideration in high-dimensional settings.
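The basic procedure analyzed in the abstract can be sketched as follows: a minimal nonparametric bootstrap for the leading eigenvalues of the sample covariance matrix, yielding percentile confidence intervals. This is an illustrative sketch only; the data-generating model, the sizes `n`, `p`, `k`, and `B`, and the omission of the eigenvalue transformation discussed in the paper are all simplifying assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: n observations in dimension p with decaying variances
# (illustrative choice, not tied to the paper's setting).
n, p, k = 200, 50, 3
X = rng.standard_normal((n, p)) * np.linspace(2.0, 0.5, p)

def leading_eigvals(A, k):
    # Largest k eigenvalues of a symmetric matrix, in decreasing order.
    w = np.linalg.eigvalsh(A)
    return w[::-1][:k]

Sigma_hat = np.cov(X, rowvar=False)
lam_hat = leading_eigvals(Sigma_hat, k)

# Sample analogue of the effective rank r(Sigma) = tr(Sigma) / ||Sigma||_op,
# which governs the approximation rate r(Sigma)/sqrt(n) in the abstract.
eff_rank = np.trace(Sigma_hat) / lam_hat[0]

# Nonparametric bootstrap: resample rows with replacement and
# recompute the leading eigenvalues of the resampled covariance.
B = 500
boot = np.empty((B, k))
for b in range(B):
    idx = rng.integers(0, n, size=n)
    boot[b] = leading_eigvals(np.cov(X[idx], rowvar=False), k)

# Percentile 95% confidence intervals for each leading eigenvalue.
lo, hi = np.percentile(boot, [2.5, 97.5], axis=0)
```

In high dimensions, the paper's point is that resampling the raw eigenvalues as above can perform poorly, which is why a transformation of $\hat\Sigma$'s eigenvalues before bootstrapping becomes important.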
Similar Papers
Minimax Rates of Estimation for Sparse PCA in High Dimensions
We study sparse principal components analysis in the high-dimensional setting, where p (the number of variables) can be much larger than n (the number of observations). We prove optimal, non-asymptotic lower and upper bounds on the minimax estimation error for the leading eigenvector when it belongs to an lq ball for q ∈ [0, 1]. Our bounds are sharp in p and n for all q ∈ [0, 1] over a wide cla...
Robust PCA for High-Dimensional Data
We consider the dimensionality-reduction problem for a contaminated data set in a very high dimensional space, i.e., the problem of finding a subspace approximation of observed data, where the number of observations is of the same magnitude as the number of variables of each observation, and the data set contains some outlying observations. We propose a High-dimension Robust Principal Component...
Influential Features PCA for High Dimensional Clustering
We consider a clustering problem where we observe feature vectors Xi ∈ R^p, i = 1, 2, . . . , n, from K possible classes. The class labels are unknown and the main interest is to estimate them. We are primarily interested in the modern regime of p ≫ n, where classical clustering methods face challenges. We propose Influential Features PCA (IF-PCA) as a new clustering procedure. In IF-PCA, we select...
PCA learning for sparse high-dimensional data
We study the performance of principal component analysis (PCA). In particular, we consider the problem of how many training pattern vectors are required to accurately represent the low-dimensional structure of the data. This problem is of particular relevance now that PCA is commonly applied to extremely high-dimensional (N ≈ 5000–30000) real data sets produced from molecular-biology research p...
Important Features PCA for high dimensional clustering
We consider a clustering problem where we observe feature vectors Xi ∈ R^p, i = 1, 2, . . . , n, from K possible classes. The class labels are unknown and the main interest is to estimate them. We are primarily interested in the modern regime of p ≫ n, where classical clustering methods face challenges. We propose Important Features PCA (IF-PCA) as a new clustering procedure. In IF-PCA, we select a ...
Journal
Journal title: Statistica Sinica
Year: 2024
ISSN: 1017-0405, 1996-8507
DOI: https://doi.org/10.5705/ss.202021.0158